Erratum: sample size determination for the false discovery rate

Authors

  • Stan Pounds
  • Cheng Cheng
Abstract

We have corrected the routines that implement the method of Pounds and Cheng (2005) for determining the sample size of a microarray experiment that uses the false discovery rate as the ultimate measure of statistical significance. Some routines in the original R and S-plus libraries did not properly account for differences between the definition of the noncentrality parameter in Equations (18) and (19) of Pounds and Cheng (2005) and the definition of the noncentrality parameter used by the internal R or S-plus function that evaluates the cumulative distribution function of the noncentral F-distribution. A corrected version of the R routine library is now available for download from www.stjuderesearch.org/depts/biostats/fdrsampsize/index.html. To avoid confusion, all routines in the revised library use the definition of Equations (18) and (19) of Pounds and Cheng (2005).

For a variety of settings involving a single two-sided test, the accuracy of the revised routine library has been checked by comparing the results of the avepow.oneway routine with those of proc power in SAS and the built-in R function power.anova.test (Appendix).

Corrected results of the simulation studies of Pounds and Cheng (2005) are reported in Tables 1 and 2 and Figure 1. Table 1 gives the simulation estimate of the expected value (EV) of the average power, i.e. the EV of the ratio D of the number of true discoveries to the number of false null hypotheses, when sample size is determined using the true values of the proportion π of null hypotheses that are true and the effect size η of the false null hypotheses. The SD of D observed over 1000 simulation replications is also reported. In all but two settings, the determined sample size gives an estimate of the average power that is greater than or equal to the desired average power δ. In those two settings, the simulation estimate of average power is slightly less than the desired average power.
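The quantity D described for Table 1 can be illustrated with a small simulation. The sketch below is not the authors' routine library: it assumes two-sided z-tests in place of the F-tests of Pounds and Cheng (2005), uses the Benjamini-Hochberg procedure as a stand-in FDR method, and all function and parameter names (bh_reject, simulate_D, average_power, pi0, shift) are hypothetical. Here pi0 plays the role of π and shift is a rough stand-in for the effect size η.

```python
import random
from statistics import NormalDist, mean, stdev


def bh_reject(pvals, alpha):
    """Benjamini-Hochberg step-up rule (assumed stand-in FDR procedure).

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under the BH line
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


def simulate_D(m, pi0, shift, alpha, rng):
    """One replication: D = true discoveries / number of false nulls."""
    std = NormalDist()
    n_null = round(pi0 * m)
    is_alt = [i >= n_null for i in range(m)]
    # Assumption: z-statistics with unit variance, mean `shift` under the alternative.
    z = [rng.gauss(shift if alt else 0.0, 1.0) for alt in is_alt]
    pvals = [2.0 * (1.0 - std.cdf(abs(v))) for v in z]
    rejected = bh_reject(pvals, alpha)
    true_discoveries = sum(r and a for r, a in zip(rejected, is_alt))
    return true_discoveries / (m - n_null)


def average_power(m=1000, pi0=0.9, shift=3.0, alpha=0.05, reps=200, seed=1):
    """Simulation estimate of the EV and SD of D over `reps` replications."""
    rng = random.Random(seed)
    ds = [simulate_D(m, pi0, shift, alpha, rng) for _ in range(reps)]
    return mean(ds), stdev(ds)


if __name__ == "__main__":
    ev, sd = average_power()
    print(f"estimated EV of D: {ev:.3f}, SD of D: {sd:.3f}")
```

Comparing the estimated EV of D with a target δ across candidate settings mirrors the kind of check summarized in Table 1, under the simplifying assumptions stated above.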
Table 2 reports the corrected results of a series of simulations that generated F-statistics for a background study with per-group sample size 4 from the assumed setting. The background F-statistics, instead of the actual effect size parameters, were used to determine the sample size with the method of Pounds and Cheng (2005). In all settings, the simulation estimate of the average power exceeded the desired average power δ. Figure 1 gives corrected results for the ‘real data simulation’ performed by resampling from a real data set that is described in Section 4.2 of Pounds and Cheng (2005). The mean of the SPLOSH


Similar resources

Sample size determination for the false discovery rate

MOTIVATION There is not a widely applicable method to determine the sample size for experiments basing statistical significance on the false discovery rate (FDR). RESULTS We propose and develop the anticipated FDR (aFDR) as a conceptual tool for determining sample size. We derive mathematical expressions for the aFDR and anticipated average statistical power. These expressions are used to dev...


Quick Calculation for Sample Size while Controlling False Discovery Rate with Application to Microarray Analysis

Summary. Sample size estimation is important in microarray or proteomic experiments since biologists can typically afford only a few repetitions. Classical procedures to calculate sample size are based on controlling type I error, e.g., family-wise error rate (FWER). In the context of microarray and other large-scale genomic data, it is more powerful and more reasonable to control false disco...


The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data

Background and Objectives: In recent years, new technologies have led to the production of large amounts of data, and in the field of biology, microarray technology has also developed dramatically. Meanwhile, the Fisher test is used to compare the control group with two or more experimental groups and also to detect the differentially expressed genes. In this study, the false discovery rate was investiga...


Gene expression: Quick calculation for sample size while controlling false discovery rate with application to microarray analysis

MOTIVATION Sample size calculation is important in experimental design and is even more so in microarray or proteomic experiments since only a few repetitions can be afforded. In the multiple testing problems involving these experiments, it is more powerful and more reasonable to control false discovery rate (FDR) or positive FDR (pFDR) instead of type I error, e.g. family-wise error rate (FWER...


Sample Size Estimation while Controlling False Discovery Rate for Microarray Experiments Using the ssize.fdr Package

Microarray experiments are becoming more and more popular and critical in many biological disciplines. As in any statistical experiment, appropriate experimental design is essential for reliable statistical inference, and sample size has a crucial role in experimental design. Because microarray experiments are rather costly, it is important to have an adequate sample size that will achieve a de...



Journal:
  • Bioinformatics

Volume 25, Issue 5

Pages: -

Publication date: 2009